r/MachineLearning - [D] Paper Explained - Planning to Explore via Self-Supervised World Models
What can an agent do without any reward? Many formulations of intrinsic reward exist (curiosity, novelty, etc.), but they are all retrospective: the agent only finds out that a state was interesting after it has already visited it. Plan2Explore is the first model that plans in imagination, inside a learned latent world model, to seek out states where it is uncertain about what will happen next.
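
To make "seek out states where it is uncertain" a bit more concrete, here is a minimal sketch of one common way to score uncertainty: train several one-step predictors and use their disagreement (variance across the ensemble) as the intrinsic reward. As far as I recall this is the flavor of signal the paper uses, but treat everything below as an assumption for illustration: the tiny random linear "models", the function names (`make_linear_model`, `disagreement`, `pick_exploratory_action`), and the one-step greedy choice are all hypothetical stand-ins, not the authors' implementation, which plans over multi-step imagined latent rollouts.

```python
import random
from statistics import pvariance


def make_linear_model(in_dim, out_dim, rng):
    """Hypothetical stand-in for one learned one-step predictor.

    A real ensemble member would be a trained network; here the weights
    are random, so the members disagree simply because they differ.
    """
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def predict(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return predict


def disagreement(ensemble, state, action):
    """Intrinsic reward: variance across ensemble predictions, averaged over output dims."""
    x = list(state) + list(action)
    preds = [model(x) for model in ensemble]
    per_dim = [pvariance(dim_values) for dim_values in zip(*preds)]
    return sum(per_dim) / len(per_dim)


def pick_exploratory_action(ensemble, state, candidate_actions):
    """Greedy one-step 'plan': choose the action the ensemble disagrees about most."""
    return max(candidate_actions, key=lambda a: disagreement(ensemble, state, a))


rng = random.Random(0)
ensemble = [make_linear_model(in_dim=3, out_dim=2, rng=rng) for _ in range(5)]
state = [0.5, -0.2]                      # assumed 2-d latent state
candidates = [[0.0], [1.0], [-1.0]]      # assumed 1-d candidate actions
best = pick_exploratory_action(ensemble, state, candidates)
print("most uncertain action:", best)
print("disagreement:", disagreement(ensemble, state, best))
```

The key point is that the score is computed on predicted outcomes, so the agent can rank actions by expected novelty before it ever takes them, rather than being rewarded only after the fact.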